Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis

Authors

  • Hongwei Hu
  • Martin J. Russell
Abstract

This paper describes the use of non-linear formant trajectories to model speech dynamics. The non-linear formant dynamics model is evaluated in HMM-based speech synthesis experiments, in which the 12-dimensional parallel formant synthesiser control parameters and their time derivatives serve as the feature vectors in the HMM. Two types of formant synthesiser control parameters, named piecewise constant and smooth trajectory parameters, are used to drive the classic parallel formant synthesiser. The quality of the synthetic speech is assessed using three kinds of subjective tests. The paper shows that the non-linear formant dynamics model can improve the performance of HMM-based speech synthesis.
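As a rough illustration of the feature vectors mentioned in the abstract, the sketch below appends first-order time derivatives to each 12-dimensional frame of control parameters. The abstract does not say how the derivatives are computed, so the central-difference window, the edge handling, and the function name `append_deltas` are assumptions made here for illustration only.

```python
from typing import List

Frame = List[float]


def append_deltas(frames: List[Frame]) -> List[Frame]:
    """Return each frame with its first-order time derivative appended.

    Assumption (not from the paper): the derivative is a central
    difference, 0.5 * (x[t+1] - x[t-1]), and the first and last
    frames are repeated at the sequence edges.
    """
    n = len(frames)
    observations = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]      # repeat first frame at the start
        next_frame = frames[min(t + 1, n - 1)]  # repeat last frame at the end
        delta = [0.5 * (b - a) for a, b in zip(prev_frame, next_frame)]
        observations.append(list(frames[t]) + delta)
    return observations


if __name__ == "__main__":
    # Four synthetic 12-dimensional frames forming a linear ramp over time.
    frames = [[float(t + d) for d in range(12)] for t in range(4)]
    obs = append_deltas(frames)
    print(len(obs), len(obs[0]))
```

With 12 static parameters per frame, each resulting observation vector has 24 components: the static values followed by their deltas.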

Similar resources

Towards an improved model of dynamics for speech recognition and synthesis

This thesis describes the research on the use of non-linear formant trajectories to model speech dynamics under the framework of a multiple-level segmental hidden Markov model (MSHMM). The particular type of intermediate-layer model investigated in this study is based on the 12-dimensional parallel formant synthesiser (PFS) control parameters, which can be directly used to synthesise speech wit...

Speech recognition using non-linear trajectories in a formant-based articulatory layer of a multiple-level segmental HMM

This paper describes how non-linear formant trajectories, based on ‘trajectory HMM’ proposed by Tokuda et al., can be exploited under the framework of multiple-level segmental HMMs. In the resultant model, named a non-linear/linear multiple-level segmental HMM, speech dynamics are modeled as non-linear smooth trajectories in the formant-based intermediate layer. These formant trajectories are m...

Analysis, modelling and synthesis of formants of British, American and Australian accents

The formant spaces of three major English accents, namely British, American and Australian, are modelled and used for accent conversion. Accent synthesis, through modification of the acoustic parameters of speech, provides a means for assessing the perceptual contribution of each parameter to conveying an accent. An improved method based on a linear prediction (LP) model feature analysis and a 2-D...

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...

Formant-tracking Linear Prediction Models for Speech Processing in Noisy Environments

This paper presents a formant-tracking method for estimation of the time-varying trajectories of a linear prediction (LP) model of speech in noise. The main focus of this work is on the modelling of the non-stationary temporal trajectories of the formants of speech for improved LP model estimation in noise. The proposed approach provides a systematic framework for modelling the inter-frame corr...

Publication date: 2010